4.6. Decrypting Data
Decrypting is the exact inverse of the encryption process, and the code is very
similar to that used for encryption. In short, the following are the two
steps for decryption:
1. Separate the encrypted Ksym from the encrypted archive and decrypt Ksym using Kenc.
2. Decrypt the data by using AES-256 in CBC mode with Ksym as the key.
Example 7 shows the code to do this.
Example 7. Decrypting data
import cStringIO
from M2Crypto import EVP, RSA

def aes256_decrypt_data(ciphertext, key, iv):
    """ Decryption and unpadding using AES256-CBC and the
    specified key and IV."""
    dec = 0  # Operation flag for EVP.Cipher: 0 means decrypt
    cipher = EVP.Cipher('aes_256_cbc', key, iv, dec, 0)
    pbuf = cStringIO.StringIO()
    cbuf = cStringIO.StringIO(ciphertext)
    plaintext = aes256_stream_helper(cipher, cbuf, pbuf)
    pbuf.close()
    cbuf.close()
    return plaintext

def block_decrypt(ciphertext, key):
    """ High level decryption function. Assumes the IV occupies the
    first 32 bytes and precedes the actual ciphertext"""
    # The 32-byte prefix must match what the encryption code wrote
    iv = ciphertext[:32]
    ciphertext = ciphertext[32:]
    return aes256_decrypt_data(ciphertext, key, iv)

def decrypt_rsa(rsa_key, ciphertext):
    return rsa_key.private_decrypt(ciphertext, RSA.pkcs1_padding)
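The "unpadding" mentioned in the docstring of aes256_decrypt_data is done by OpenSSL itself, which by default applies PKCS#7-style padding in CBC mode. The following is only a minimal sketch of that padding scheme, written for Python 3 with the standard library alone. The names pkcs7_pad and pkcs7_unpad are hypothetical helpers for illustration and are not part of M2Crypto or of the chapter's code.

```python
BLOCK_SIZE = 16  # AES block size in bytes


def pkcs7_pad(data, block_size=BLOCK_SIZE):
    """Append N bytes of value N so the length is a multiple of block_size."""
    pad_len = block_size - (len(data) % block_size)
    return data + bytes([pad_len]) * pad_len


def pkcs7_unpad(padded, block_size=BLOCK_SIZE):
    """Strip the padding, rejecting input that is not validly padded."""
    if not padded or len(padded) % block_size != 0:
        raise ValueError("padded data must be a non-empty multiple of the block size")
    pad_len = padded[-1]
    if pad_len < 1 or pad_len > block_size:
        raise ValueError("invalid padding length")
    if padded[-pad_len:] != bytes([pad_len]) * pad_len:
        raise ValueError("invalid padding bytes")
    return padded[:-pad_len]


if __name__ == "__main__":
    padded = pkcs7_pad(b"backup data")
    print(len(padded), pkcs7_unpad(padded))
```

Note that input whose length is already a multiple of the block size still gains a full block of padding, which is what lets the receiver strip the padding unambiguously.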
4.7. Signing and Validating Data
You now have encrypted data. The only cryptographic piece that is
missing is the ability to sign the archive and validate this signature
to catch tampering of any form.
In ancient days, this was done with a wax seal. The sender would
pour molten wax over the letter, affix his emblem on the molten wax, and
let it cool. The idea here was that no one else had access to the same
seal (authenticity), and that any tampering could be easily detected
(integrity).
Note: It is difficult to believe that some enterprising craftsman
couldn’t replicate the seal of the King of England or other such
royalty. But then, secrecy and cryptography in the Middle Ages worked
in mysterious ways, and the real security might have come from the
messenger carrying the letter, rather than the unbroken seal
itself.
The signing RSA key (Ksign) can be used to
achieve the exact same goals.
Note: This is different from the key pair you used for encryption
(Kenc).
Though an actual attack would be difficult, using one RSA key pair for
both encryption and signing is considered bad practice. Signing applies
the same private-key operation that decryption does, so signing
attacker-supplied data with the encryption key could undermine the
confidentiality of the encrypted data.
Signing is very simple. The data to be signed is hashed using a
cryptographically strong hash algorithm, and the hash is then encrypted
with the private key to create the signature. The receiver computes a
hash using the same algorithm, decrypts the signature using the public
key of the sender, and checks whether the two hashes match.
In this case, the receiver is the user trying to restore a backup
in the future. Any change to the data would cause a change in the hash,
which makes an integrity failure detectable. Since only the user knows
the private key, an attacker cannot forge a valid signature, which provides
authenticity.
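The hash-then-sign flow described above can be sketched with a toy RSA key. This is an illustration only, under loud assumptions: the key (p=61, q=53) is a textbook example far too small to be secure, the digest is reduced modulo the tiny modulus, no padding is applied, and the names toy_digest, toy_sign, and toy_verify are hypothetical. It is written for Python 3 and is not the M2Crypto code used in this chapter.

```python
import hashlib

# Textbook toy RSA key (p=61, q=53). Far too small to be secure; it is
# used here only so the arithmetic is visible.
N = 3233   # modulus (public)
E = 17     # public exponent
D = 2753   # private exponent


def toy_digest(data):
    """SHA-256 of data, reduced modulo N so the toy key can process it."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def toy_sign(data):
    """Signer: hash the data, then apply the private-key operation."""
    return pow(toy_digest(data), D, N)


def toy_verify(data, signature):
    """Verifier: apply the public-key operation and compare the hashes."""
    return pow(signature, E, N) == toy_digest(data)


if __name__ == "__main__":
    archive = b"encrypted archive bytes"
    sig = toy_sign(archive)
    print(toy_verify(archive, sig))            # genuine signature
    print(toy_verify(archive, (sig + 1) % N))  # altered signature
```

A real signature scheme also pads the digest before the private-key operation, which this sketch deliberately leaves out.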
Example 8 shows the code for
signing and verification. The code is straightforward because
M2Crypto/OpenSSL does all the heavy lifting for you.
Example 8. Signing and verifying data
import hashlib

def sign_rsa(rsa_key, data):
    """ Expects an RSA key pair. Signs with SHA-256. Would like to use
    RSASSA-PSS but only dev versions of M2Crypto support that"""
    digest = hashlib.sha256(data).digest()
    return rsa_key.sign(digest, 'sha256')

def verify_rsa(rsa_key, data, sig):
    """ Verifies a signature"""
    digest = hashlib.sha256(data).digest()
    return rsa_key.verify(digest, sig, 'sha256') == 1
Note: The only tricky bit is that M2Crypto expects the caller to hash
the data before calling it. Though a parameter naming the hash
algorithm is passed in ('sha256', in this case), OpenSSL does not use
this parameter to hash the data, but rather to determine the digest
information that is added before padding.
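To make the "digest information before padding" step concrete, here is a rough sketch of the PKCS #1 v1.5 signature encoding for a SHA-256 digest, written for Python 3. It only illustrates the kind of block that is handed to the RSA private-key operation; the function name emsa_pkcs1_v1_5_encode is hypothetical, and the 256-byte default assumes a 2048-bit key, matching the 256-byte signatures used later in this chapter.

```python
import hashlib

# DER-encoded DigestInfo prefix for SHA-256, as listed in PKCS #1.
# It identifies the hash algorithm; the raw digest follows it.
SHA256_DIGEST_INFO_PREFIX = bytes.fromhex(
    "3031300d060960864801650304020105000420"
)


def emsa_pkcs1_v1_5_encode(digest, key_size_bytes=256):
    """Build 0x00 0x01 FF..FF 0x00 DigestInfo for a SHA-256 digest."""
    if len(digest) != hashlib.sha256().digest_size:
        raise ValueError("expected a 32-byte SHA-256 digest")
    digest_info = SHA256_DIGEST_INFO_PREFIX + digest
    pad_len = key_size_bytes - len(digest_info) - 3
    if pad_len < 8:
        raise ValueError("key too small for this digest")
    return b"\x00\x01" + b"\xff" * pad_len + b"\x00" + digest_info


if __name__ == "__main__":
    digest = hashlib.sha256(b"archive bytes").digest()
    block = emsa_pkcs1_v1_5_encode(digest)
    print(len(block), block[:4].hex(), block[-32:] == digest)
```

This is why passing the wrong algorithm name would produce a signature that fails to verify even when the digest bytes are identical: the prefix in the padded block would differ.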
4.8. Putting the Cryptography Together
Now you need some “glue” code to put all the pieces together.
Example 9 shows the code to create an encrypted archive given an
input directory or file. It uses all the code you’ve seen so far to
create a tar+gzip file, encrypt that file using a temporary session key,
and then encrypt and append the session key itself. A signature
is added at the end, and the encrypted archive is ready to
be pushed to the cloud.
Example 10. Decrypting an encrypted archive
def extract_encrypted_archive(archive_name, keys):
    # Load archive. Separate into encrypted AES key, plaintext sig of
    # encrypted data and encrypted archive itself
    enc_file = open(archive_name, "rb")
    enc_file_bytes = enc_file.read()
    enc_file.close()

    enc_aes_key = enc_file_bytes[0:256]
    rsa_sig = enc_file_bytes[256:512]
    enc_data = enc_file_bytes[512:]

    rsa_sign_key = keys[crypto.SIGNING_KEY]
    rsa_enc_key = keys[crypto.ENCRYPTION_KEY]

    # Check the signature in the file to see whether it matches the
    # encrypted data. We do Encrypt-Then-Mac here so that
    # we avoid decryption when the signature is invalid
    if not crypto.verify_rsa(rsa_sign_key, enc_data, rsa_sig):
        print "Signature verification failure. Corrupt or tampered archive!"
        return

    # Decrypt the AES key and then decrypt the
    # encrypted archive using the decrypted AES key
    aes_key = crypto.decrypt_rsa(rsa_enc_key, enc_aes_key)
    decrypted_archive_bytes = crypto.block_decrypt(enc_data, aes_key)

    # Write a temporary file and then extract the contents to the
    # current directory
    [os_handle, temp_file] = tempfile.mkstemp()
    temp_file_handle = os.fdopen(os_handle, 'wb')
    temp_file_handle.write(decrypted_archive_bytes)
    temp_file_handle.close()
    extract_tar_gzip(temp_file, ".")
    os.remove(temp_file)
The code shown in Example 9 takes the encrypted
archive data and attaches a signature (a MAC, in
cryptographic parlance) computed over the encrypted data, as opposed to the plaintext version. In the
cryptographic world, opinion is split on whether this is better
than computing a signature over the plaintext
and encrypting both the signature and the plaintext. In reality, both techniques are valid, and it comes down to
individual preference. The book Practical
Cryptography leans toward computing a signature over the
plaintext. Doing it the other way enables you to detect immediately
whether an archive is valid without having to decrypt anything. You
can find a summary of the arguments for doing encrypt-then-MAC at
http://www.daemonology.net/blog/2009-06-24-encrypt-then-mac.html.
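The practical benefit of checking the tag before decrypting can be sketched as follows. This is a Python 3 illustration that swaps in a symmetric HMAC-SHA256 from the standard library in place of the RSA signature used in this chapter, purely so the example runs without M2Crypto. The names seal and open_sealed, and the decrypt callback, are hypothetical.

```python
import hashlib
import hmac

TAG_SIZE = 32  # length of an HMAC-SHA256 tag in bytes


def seal(mac_key, ciphertext):
    """Prepend an HMAC-SHA256 tag computed over the ciphertext."""
    tag = hmac.new(mac_key, ciphertext, hashlib.sha256).digest()
    return tag + ciphertext


def open_sealed(mac_key, blob, decrypt):
    """Check the tag first; call decrypt() only if the tag is valid."""
    tag, ciphertext = blob[:TAG_SIZE], blob[TAG_SIZE:]
    expected = hmac.new(mac_key, ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("tag mismatch: corrupt or tampered data")
    return decrypt(ciphertext)


if __name__ == "__main__":
    key = b"k" * 32  # placeholder MAC key for the demo only
    blob = seal(key, b"opaque ciphertext")
    tampered = blob[:-1] + bytes([blob[-1] ^ 1])
    print(open_sealed(key, blob, decrypt=lambda c: c))
    try:
        open_sealed(key, tampered, decrypt=lambda c: c)
    except ValueError as err:
        print(err)
```

Because the check runs over the ciphertext, the decrypt callback is never invoked on tampered input.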
The code in Example 9 is
a bit longer than it needs to be, since a bug in Python’s tarfile module prevents you from dealing with
tar+gzip files completely in memory. The data must be temporarily
written to disk, and then read back into memory, where it is then
encrypted and signed using the code already seen:
def create_encrypted_archive(directory_or_file, archive_name, keys):
    # First, let's tar and gzip the file/folder we're given to
    # the temp directory. This is a roundabout way of getting the tar+gzipped
    # data into memory due to a bug in tarfile with dealing with StringIO
    tempdir = tempfile.gettempdir() + "/"
    temp_file = tempdir + archive_name + ".tar.gz"
    generate_tar_gzip(directory_or_file, temp_file)
    gzip_file_handle = open(temp_file, "rb")
    gzip_data = gzip_file_handle.read()
    gzip_file_handle.close()
    os.remove(temp_file)  # We don't need source tar gzip file

    # Generate a session AES-256 key and encrypt gzipped archive with it
    aes_key = crypto.generate_rand_bits(256)
    encrypted_archive = crypto.block_encrypt(gzip_data, aes_key)

    # Encrypt Ksym (session key) with RSA key (Kenc)
    rsa_enc_key = keys[crypto.ENCRYPTION_KEY]
    aes_key_enc = crypto.encrypt_rsa(rsa_enc_key, aes_key)  # 256 bytes

    # Sign encrypted data
    # There's much debate regarding in which order you encrypt and sign/mac.
    # I prefer this way since this lets us not have to decrypt anything
    # when the signature is invalid
    # See http://www.daemonology.net/blog/2009-06-24-encrypt-then-mac.html
    rsa_sign_key = keys[crypto.SIGNING_KEY]
    rsa_sig = crypto.sign_rsa(rsa_sign_key, encrypted_archive)  # 256 bytes

    # Append encrypted aes key, signature and archive in that order
    return aes_key_enc + rsa_sig + encrypted_archive
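The temporary file above works around a tarfile limitation in the Python version this chapter targets. As a hedged aside, on current Python 3 releases tarfile can read and write an in-memory io.BytesIO object, so the round trip through disk may not be needed there. The sketch below assumes Python 3. The helper names tar_gzip_bytes and untar_gzip_bytes are hypothetical, and the sketch archives a name-to-bytes mapping rather than a directory to stay self-contained.

```python
import io
import tarfile


def tar_gzip_bytes(files):
    """Return tar+gzip data for a {name: bytes} mapping, built in memory."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, content in sorted(files.items()):
            info = tarfile.TarInfo(name=name)
            info.size = len(content)
            tar.addfile(info, io.BytesIO(content))
    return buf.getvalue()


def untar_gzip_bytes(data):
    """Inverse of tar_gzip_bytes: return a {name: bytes} mapping."""
    result = {}
    with tarfile.open(fileobj=io.BytesIO(data), mode="r:gz") as tar:
        for member in tar.getmembers():
            if member.isfile():
                result[member.name] = tar.extractfile(member).read()
    return result


if __name__ == "__main__":
    blob = tar_gzip_bytes({"notes.txt": b"hello", "data.bin": b"\x00\x01"})
    print(sorted(untar_gzip_bytes(blob)))
```

The resulting bytes could then be passed to an encryption routine directly, in the same position that gzip_data occupies in the code above.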
Extracting an encrypted archive is the exact reverse. The code
first splits the encrypted AES session key (Ksym) and the signature
from the encrypted data. This relies on the lengths of these two fields
being fixed and known in advance (256 bytes each in this code). If
different sizes were used in the future, this trivial method of parsing
out the various parts of the file wouldn’t work.
The code then verifies the signature and exits if the
signature isn’t valid. If the signature is valid, the code decrypts the
AES session key and uses it to decrypt the data. Since the decrypted data
is nothing but a gzipped, tarred archive, you wrap up by extracting the
archive to the current working directory.
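One way to remove the dependence on fixed field lengths is to store the lengths in a small header. The following Python 3 sketch shows that idea under stated assumptions: the four-byte header layout and the names pack_archive and unpack_archive are hypothetical and are not the format used by the code in this chapter.

```python
import struct

# Two unsigned 16-bit big-endian integers: key length, signature length.
HEADER = struct.Struct(">HH")


def pack_archive(enc_key, sig, enc_data):
    """Prefix the archive with the lengths of the key and the signature."""
    return HEADER.pack(len(enc_key), len(sig)) + enc_key + sig + enc_data


def unpack_archive(blob):
    """Split a packed archive into (enc_key, sig, enc_data)."""
    if len(blob) < HEADER.size:
        raise ValueError("archive too short for header")
    key_len, sig_len = HEADER.unpack_from(blob)
    key_start = HEADER.size
    sig_start = key_start + key_len
    data_start = sig_start + sig_len
    if len(blob) < data_start:
        raise ValueError("archive truncated")
    return blob[key_start:sig_start], blob[sig_start:data_start], blob[data_start:]


if __name__ == "__main__":
    blob = pack_archive(b"K" * 256, b"S" * 256, b"encrypted data")
    key, sig, data = unpack_archive(blob)
    print(len(key), len(sig), data)
```

With a header like this, moving to a larger RSA key changes only the values written into the header, not the parsing code. A real format would probably also want the header itself covered by the signature.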